28 research outputs found

    Safe, Flexible Aliasing with Deferred Borrows

    In recent years, programming-language support for static memory safety has developed significantly. In particular, borrowing and ownership systems, such as the one pioneered by the Rust language, require the programmer to abide by certain aliasing restrictions but in return guarantee that no unsafe aliasing can ever occur. This allows parallel code to be written, or existing code to be parallelized, safely and easily, and the aliasing restrictions also statically prevent a whole class of bugs such as iterator invalidation. Borrowing is easy to reason about because it matches the intuitive ownership-passing conventions often used in systems languages. Unfortunately, a borrowing-based system can sometimes be too restrictive. Because borrows enforce aliasing rules for their entire lifetimes, they cannot be used to implement some common patterns that pointers would allow. Programs often fall back on pseudo-pointers, such as indices into an array of nodes or objects, which can be error-prone: the program is still memory-safe by construction, but it is not logically memory-safe, because an object access may reach the wrong object. In this work, we propose deferred borrows, which provide the type-safety benefits of borrows without the constraints on usage patterns that they otherwise impose. Deferred borrows work by encapsulating enough state at creation time to perform the actual borrow later, while statically guaranteeing that the eventual borrow will reach the same object it would have otherwise. The static guarantee is made with a path-dependent type tying the deferred borrow to the container (struct, vector, etc.) of the borrowed object. This combines the type-safety of borrowing with the flexibility of traditional pointers, while retaining logical memory-safety.
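    The pseudo-pointer problem described above can be illustrated with a small sketch. It is written in Python rather than Rust, so it can only show a runtime analogue: the abstract describes a static, path-dependent-type guarantee, which this code does not reproduce. The class `DeferredRef` and the sample lists are hypothetical names chosen for illustration, and the handle does not protect against the container being mutated between creation and use.

```python
class DeferredRef:
    """Hypothetical runtime analogue of a deferred borrow.

    It records the container and the index at creation time and performs
    the actual access only when borrow() is called.
    """

    def __init__(self, container, index):
        if not 0 <= index < len(container):
            raise IndexError("index out of range at creation time")
        self._container = container
        self._index = index

    def borrow(self):
        # The access happens here, always against the bound container.
        return self._container[self._index]


employees = ["alice", "bob"]
projects = ["compiler", "router"]

# Bare index: nothing ties it to `employees`, so it can be applied to the
# wrong container without any error being raised.
bob_index = 1
wrong = projects[bob_index]

# Handle that carries its container: the lookup cannot be misdirected.
bob_ref = DeferredRef(employees, 1)
right = bob_ref.borrow()

print(wrong, right)
```

    The bare index is memory-safe but reaches the wrong object, which is the "not logically memory-safe" case; the handle removes that failure mode only at run time, whereas the paper claims a compile-time guarantee.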

    RowHammer: Reliability Analysis and Security Implications

    As process technology scales down to smaller dimensions, DRAM chips become more vulnerable to disturbance, a phenomenon in which different DRAM cells interfere with each other's operation. For the first time in academic literature, our ISCA paper exposes the existence of disturbance errors in commodity DRAM chips that are sold and used today. We show that repeatedly reading from the same address could corrupt data in nearby addresses. More specifically, when a DRAM row is opened (i.e., activated) and closed (i.e., precharged) repeatedly (i.e., hammered), it can induce disturbance errors in adjacent DRAM rows. This failure mode is popularly called RowHammer. We tested 129 DRAM modules manufactured within the past six years (2008-2014) and found 110 of them to exhibit RowHammer disturbance errors, the earliest of which dates back to 2010. In particular, all modules from the past two years (2012-2013) were vulnerable, which implies that the errors are a recent phenomenon affecting more advanced generations of process technology. Importantly, disturbance errors pose an easily-exploitable security threat since they are a breach of memory protection, wherein accesses to one page (mapped to one row) modify the data stored in another page (mapped to an adjacent row). Comment: This is the summary of the paper titled "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors" which appeared in ISCA in June 201
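    The relationship between a hammered row and its neighbours can be sketched as a toy bookkeeping model. This is only an illustration of the adjacency described above, not the paper's experimental method: the function name `rows_at_risk`, the activation trace, and the `threshold` value are all assumptions, and no real DRAM threshold is implied.

```python
from collections import Counter


def rows_at_risk(activations, threshold, num_rows):
    """Return the rows adjacent to any row activated at least `threshold` times.

    `activations` lists the row numbers opened within one observation window.
    `threshold` is a hypothetical limit chosen for illustration only.
    """
    counts = Counter(activations)
    victims = set()
    for row, count in counts.items():
        if count >= threshold:
            # Physically adjacent rows are the potential victims.
            for neighbor in (row - 1, row + 1):
                if 0 <= neighbor < num_rows:
                    victims.add(neighbor)
    return sorted(victims)


# Rows 5 and 0 are hammered; row 9 is touched only a few times.
trace = [5] * 1000 + [9] * 3 + [0] * 1000
print(rows_at_risk(trace, threshold=1000, num_rows=16))  # prints [1, 4, 6]
```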

    CHIPPER: A Low-complexity Bufferless Deflection Router

    As Chip Multiprocessors (CMPs) scale to tens or hundreds of nodes, the interconnect becomes a significant factor in cost, energy consumption and performance. Recent work has explored many design tradeoffs for networks-on-chip (NoCs) with novel router architectures to reduce hardware cost. In particular, recent work proposes bufferless deflection routing to eliminate router buffers. The high cost of buffers makes this choice potentially appealing, especially for low-to-medium network loads. However, current bufferless designs usually add complexity to control logic. Deflection routing introduces a sequential dependence in port allocation, yielding a slow critical path. Explicit mechanisms are required for livelock freedom due to the non-minimal nature of deflection. Finally, deflection routing can fragment packets, and the reassembly buffers require large worst-case sizing to avoid deadlock, due to the lack of network backpressure. The complexity that arises out of these three problems has discouraged practical adoption of bufferless routing. To counter this, we propose CHIPPER (Cheap-Interconnect Partially Permuting Router), a simplified router microarchitecture that eliminates in-router buffers and the crossbar. We introduce three key insights: first, that deflection routing port allocation maps naturally to a permutation network within the router; second, that livelock freedom requires only an implicit token-passing scheme, eliminating expensive age-based priorities; and finally, that flow control can provide correctness in the absence of network backpressure, avoiding deadlock and allowing cache miss buffers (MSHRs) to be used as reassembly buffers.
    Using multiprogrammed SPEC CPU2006, server, and desktop application workloads and SPLASH-2 multithreaded workloads, we achieve an average 54.9% network power reduction for 13.6% average performance degradation (multiprogrammed) and 73.4% power reduction for 1.9% slowdown (multithreaded), with minimal degradation and large power savings at low-to-medium load. Finally, we show 36.2% router area reduction relative to buffered routing, with comparable timing.
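    The sequential dependence in port allocation mentioned above can be seen in a minimal sketch of generic deflection routing. This is not CHIPPER's permutation-network design; it is the conventional priority-ordered allocation that the abstract identifies as the slow path. The function name `allocate_ports` and the example requests are assumptions made for illustration.

```python
def allocate_ports(requests, num_ports):
    """Assign every flit to a distinct output port without buffering.

    `requests` lists each flit's desired output port, ordered from highest
    to lowest priority. A flit whose port is already taken is deflected to
    a free port instead of waiting.
    """
    if len(requests) > num_ports:
        raise ValueError("more flits than output ports")
    free = set(range(num_ports))
    assignment = [None] * len(requests)

    # Pass 1: grant desired ports in priority order. Each decision depends
    # on all earlier ones, which is the sequential dependence.
    for i, want in enumerate(requests):
        if want in free:
            assignment[i] = want
            free.remove(want)

    # Pass 2: deflect the losers to the lowest-numbered remaining port.
    for i in range(len(requests)):
        if assignment[i] is None:
            assignment[i] = min(free)
            free.remove(assignment[i])
    return assignment


# Three flits want port 2; only the highest-priority one gets it.
print(allocate_ports([2, 2, 0, 2], num_ports=4))  # prints [2, 1, 0, 3]
```

    Because every flit leaves on a distinct port, the result is always a permutation of inputs to outputs, which is the property the abstract says maps naturally onto a permutation network.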

    The Heterogeneous Block Architecture

    This paper makes two observations that lead to a new heterogeneous core design. First, we observe that most serial code exhibits fine-grained heterogeneity: at the scale of tens or hundreds of instructions, regions of code fit different microarchitectures better (at the same point or at different points in time). Second, we observe that by grouping contiguous regions of instructions into blocks that are executed atomically, a core can exploit this fine-grained heterogeneity: atomicity allows each block to be executed independently on its own execution backend that fits its characteristics best. Based on these observations, we propose a fine-grained heterogeneous core design, called the heterogeneous block architecture (HBA), that combines heterogeneous execution backends into one core. HBA breaks the program into blocks of code, determines the best backend for each block, and specializes the block for that backend. As an example HBA design, we combine out-of-order, VLIW, and in-order backends, using simple heuristics to choose backends for different dynamic instruction blocks. Our extensive evaluations compare this example HBA design to multiple baseline core designs (including monolithic out-of-order, clustered out-of-order, in-order and a state-of-the-art heterogeneous core design) and show that it provides significantly better energy efficiency than all designs at similar performance.

    HAT: Heterogeneous Adaptive Throttling for On-Chip Networks

    The network-on-chip (NoC) is a primary shared resource in a chip multiprocessor (CMP) system. As core counts continue to increase and applications become increasingly data-intensive, the network load will also increase, leading to more congestion in the network. This network congestion can degrade system performance if the network load is not appropriately controlled. Prior works have proposed source-throttling congestion control, which limits the rate at which new network traffic (packets) enters the NoC in order to reduce congestion and improve performance. These prior congestion control mechanisms have shortcomings that significantly limit their performance: either 1) they are not application-aware, but rather throttle all applications equally regardless of applications' sensitivity to latency, or 2) they are not network-load-aware, throttling according to application characteristics bu